\chapter{Linear Algebra}\label{app:linalg}

Linear algebra concerns vector spaces and linear mappings between them. It is central to robotics as it allows describing positions and speeds of the robot within the world as well as moving parts connected to it, as well as in processing image and depth data, which is often presented in matrix form.

\section{Dot product}\label{app:linalg:dotproduct}

The dot product (or scalar product)\index{Dot product}\index{Scalar product} is the sum of the products of the individual entries of two vectors. Let $hat{a}=(a_1,\ldots,a_n)$ and $\hat{b}=(b_1,\ldots,b_n)$ be two vectors. Then, their dot product $\hat{a}\cdot\hat{b}$ is given by
\begin{equation}
\hat{a}\cdot\hat{b}=\sum_{i}^n a_ib_i
\end{equation}
The dot product therefore takes two sequences of numbers and returns a single scalar.

In robotics, the dot product is mostly relevant due to its geometric interpretation:
\begin{equation}
\hat{a}\cdot\hat{b}=\|\hat{a}\|\|\hat{b}\|\cos\theta
\end{equation}
with $\theta$ the angle between vectors $\hat{a}$ and $\hat{b}$.

If $\hat{a}$ and $\hat{b}$ are orthogonal, it follows $\hat{a}\cdot\hat{b}=0$. If $\hat{a}$ and $\hat{b}$ are parallel, it follows $\hat{a}\cdot\hat{b}=\|\hat{a}\|\|\hat{b}\|$.

\section{Cross product}

The cross product $\hat{a} \times \hat{b}$ of two vectors is defined as a vector $\hat{c}$ that is perpendicular to both $\hat{a}$ and $\hat{b}$. Its direction is given by the right-hand rule and its magnitude is equal to the area of the parallelogram that the vectors span.

Let $\hat{a}=(a_1,a_2,a_3)^T$ and $\hat{b}=(b_1,a_2,a_3)$ be two vectors in $\mathrm{R}^3$. Then, their cross product $\hat{a}\times\hat{b}$ is given by
\begin{equation}
\hat{a}\times\hat{b}=\left(
\begin{array}{l}
a_2b_3-a_3b_2\\
a_3b_1-a_1b_3\\
a_1b_2-a_2b_1
\end{array}
\right)
\end{equation}


\section{Matrix product}
Given an $n \times m$ matrix $\mathbf{A}$ and a $m\times p$ matrix $\mathbf{B}$, the matrix product $\mathbf{AB}$ is defined by
\begin{equation}
(\mathbf{AB})_{ij}=\sum_{k=1}^mA_{ik}B_{kj}
\end{equation}
where the index $ij$ indicates the i-th row and j-th column entry of the resulting $n\times p $ matrix. Each entry therefore consists of the scalar product of the i-th row of $\mathbf{A}$ with the j-th column of $\mathbf{B}$.

Note that for this to work, the right hand matrix (here $\mathbf{B}$) has to have as many columns as the left hand matrix (here $\mathbf{A}$) has rows. Therefore, the operation is not commutative, i.e., $\mathbf{AB}\neq\mathbf{BA}$.

For example, multiplying a 3x3 matrix with a 3x1 matrix (a vector), works as follows:
Let
\begin{equation}\nonumber
\mathbf{A} = \begin{pmatrix}
a & b & c \\
p & q & r \\
u & v & w
\end{pmatrix} \qquad \mathbf{B} = \begin{pmatrix}
x \\
y \\
z
\end{pmatrix}.
\end{equation}

Then their matrix product is:
\begin{equation}\nonumber
\mathbf{AB} = \begin{pmatrix}
a & b & c \\
p & q & r \\
u & v & w
\end{pmatrix} \begin{pmatrix}
x \\
y \\
z
\end{pmatrix} =\begin{pmatrix}
ax + by + cz \\
px + qy + rz \\
ux + vy + wz
\end{pmatrix}
\end{equation}

\section{Matrix inversion}
Given a matrix $\mathbf{A}$, finding the inverse $\mathbf{B}=\mathbf{A}^{-1}$ involves solving the system of equations that satisfies
\begin{equation}
\mathbf{AB}=\mathbf{BA}=\mathbf{I}
\end{equation}
with $\mathbf{I}$ the identity matrix\index{Identify matrix}. (The identity matrix is zero everywhere except at its diagonal entries, which are one.)

In the particular case of orthonormal matrices\index{Orthonormal matrix}, which columns are all orthogonal to each other and of length one, the inverse is equivalent to the transpose, i.e.
\begin{equation}
\mathbf{A}^{-1}=\mathbf{A}^T
\end{equation}
This is important, as rotation matrices are orthonormal.

In case a matrix is not quadratic, we can calculate the pseudo-inverse\index{Pseudo-inverse}, which is defined by
\begin{equation}
\mathbf{A}^+=\mathbf{A}^T(\mathbf{AA}^T)^{-1}
\end{equation}
and is often used in finding an inverse kinematic solution.

\section{Principal Component Analysis}\label{sec:appendix_PCA}
\index{Principal Component Analysis}\index{PCA}
Principal Component Analysis (PCA) breaks n-dimensional data into $n$ vectors so that each data point can be represented by a linear combination of the $n$ vectors. These $n$ vectors have two interesting properties: first, they are ordered by their variance so that the first vector is representative of the data with the highest variation in the data, and second, they are orthogonal. These vectors are therefore called \emph{principal components}. Figure \ref{fig:PCA} shows an example of two-dimensional data and the two principal components. 


\begin{figure}[htb]
    \centering
    \def\svgwidth{0.6\textwidth}
    \import{./figs/}{GaussianScatterPCA.pdf_tex}
    \caption{PCA of a multivariate Gaussian distribution centered at (1,3) with a standard deviation of 3 in roughly the (0.866, 0.5) direction and of 1 in the orthogonal direction. The vectors shown are the eigenvectors of the covariance matrix scaled by the square root of the corresponding eigenvalue, and shifted so their tails are at the mean. \copyright Nicoguaro CC BY 4.0.\label{fig:PCA}}
\end{figure}

This approach has a strong geometrical interpretation: the points along the long axis of the rectangle have higher variance than those along the the short axis. Every point in this point cloud can then be reconstructed by a linear combination of the principal component along the long axis and the principal component along the short axis. Finding these vectors is therefore akin finding the principal axes of the rectangle regardless of its orientation.

One can show that the principal components are eigenvectors of the data's covariance matrix. For this, we need to compute the mean and variance of data such as shown in Figure \ref{fig:PCA} across each dimension, shift the data so that it has zero mean, and then calculate the data's covariance matrix. One can also show that the values of the corresponding Eigenvalues are proportional to the importance of each Eigenvector.

More formally, given $N$ data samples $\mathbf{x}_i \in \mathrm{R}^n$, we can compute the entries of the $n \times n$ covariance matrix $\mathbf{C}$ as
\begin{equation}
C_{jk}=\frac{1}{N}\sum{i=1}^N(x_j^i-\mu_j)(x_k^i-\mu_k)
\end{equation}
with $\mu_j$ the mean across the $j-th$ dimension of the data. The Eigenvalues $\lambda$ and Eigenvectors $\mathbf{u}$ are given by

\begin{equation}
\mathbf{C}\mathbf{u}=\lambda \mathbf{u}
\end{equation}
and are equivalent to the principal components of the data. 

While a typical use of PCA is dimensionality reduction of data (by representing the data only using the $n$ first principal components), PCA is highly relevant in point cloud analysis in robotics, for example when finding good grasp locations.  


%%%% Random stuff on rotations
%Why any rotation can be expressed by a single vector can be seen when considering the properties of orthonomal rotation matrices. They have three eigenvalues $\lambda=1$ and a complex pair $\lambda_{1,2}=\cos \theta \pm i \sin \theta$.
%Eigenvalues and eigenvectors are defined as $Rv=\lambda v$. For the case of $\lambda=1$, the corresponding Eigenvector $v$ is unchanged by rotation. This is only possible if $v$ is the actual axis of rotation.
%The angle of rotation is now given by $\theta$, which can be inferred from the complex pair.


